Formulate a null hypothesis \(H_0\) (\(\gamma = 0\))
Compute a test statistic that follows a known distribution under the null hypothesis; for the slope of a linear regression, \[t=\frac{\hat{\gamma}}{SE(\hat{\gamma})} \sim \mathcal{T}(n-2)\]
Compute the probability that a value at least as extreme as \(|t|\) would have been obtained under the null hypothesis: this is the p-value
Reject the null hypothesis \(H_0\) if the p-value is less than some threshold
What threshold should I use? And what threshold should I use if I test many hypotheses?
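As a minimal sketch of these four steps (the slope estimate, its standard error, and the sample size below are made-up numbers for illustration, not real data):

```r
# Hypothetical estimates from a linear regression
gamma_hat <- 0.8   # estimated slope (made-up)
se_gamma  <- 0.3   # its standard error (made-up)
n         <- 50    # sample size (made-up)

# Step 2: test statistic, which follows T(n - 2) under the null gamma = 0
t_stat <- gamma_hat / se_gamma

# Step 3: two-sided p-value (abs() handles negative t values)
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# Step 4: decision at the conventional 0.05 threshold
reject <- p_value < 0.05
```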
Type I and Type II errors
Hypothesis testing is about making a decision about the null hypothesis.
| Truth \ Decision | \(H_0\) | \(H_1\) |
|---|---|---|
| \(H_0\) | Correct | Type I Error (\(\alpha\)) |
| \(H_1\) | Type II Error (\(\beta\)) | Correct |
\(\alpha\) is the probability of a Type I error (False Positive Rate). This is the criterion typically used to declare statistical significance, usually set at 0.05.
\(\beta\) is the probability of making a Type II error (False Negative Rate); \(1 - \beta\) is the power of the test.
Type I and Type II errors are strongly linked.
If we are ready to take more risk (increase the Type I error), we can call more results significant.
Type I and Type II errors: Experiment
Assume the true value of my parameter \(\gamma\) is 2, that its standard error is 1, and that the model has 20 degrees of freedom. My estimator \(\hat{\gamma}\) then approximately follows a t-distribution with non-centrality parameter (ncp) \(\gamma = 2\).
Create a function that takes as input a \(t\) value and a number of degrees of freedom and outputs the p-value of the corresponding \(t\)-test for the null \(\gamma = 0\). Use the function pt from R.
Using the R function rt, simulate 10,000 values for \(\hat{\gamma}\) (corresponding to performing 10,000 independent experiments)
Draw a histogram of the simulated values, and compare it to the null distribution
Compute the corresponding p-values using the function you just created
Plot the proportion of significant tests as a function of the Type I error, ranging from 0.001 to 0.5
R Code
par(mfrow = c(1, 2))
my_df = 20
gamma = 2

# Two-sided p-value for the null gamma = 0 (abs() handles negative t values)
pvalue = function(t, ddl) {
  return(2 * pt(abs(t), df = ddl, lower.tail = FALSE))
}

# Simulate 10,000 values of gamma_hat under the alternative (ncp = 2)
gamma_hat = rt(10000, df = my_df, ncp = gamma)

# Histogram of the simulated values vs. the null distribution
hist(gamma_hat, freq = FALSE, xlim = c(-10, 10), main = '')
xx.g = seq(-10, 10, 0.01)
lines(xx.g, dt(xx.g, df = my_df), col = 2)

# Proportion of significant tests as a function of the Type I error
pv = pvalue(gamma_hat, ddl = my_df)
type.1.err = seq(0.00, 0.5, 0.001)
power = c()
for (th in type.1.err) {
  power = c(power, mean(pv < th))
}
plot(type.1.err, power, type = 'l', xlab = 'Type I error', ylab = 'Power')
Type I Error and Multiple tests
Say we perform \(m\) independent hypothesis tests: what is the probability of at least one false positive?
P(making an error for one test) = \(\alpha\)
P(Not making an error for one test) = \((1-\alpha)\)
P(Not making an error in \(m\) independent tests) = \((1-\alpha)^m\)
P(Making at least one error in m tests) = \(1 - (1-\alpha)^m\)
Write an R function that computes the probability of making at least one error, as a function of the number of tests \(m\), and the risk \(\alpha\). Plot the results for \(m\) between 1 and 100 and \(\alpha=0.05\)
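One possible solution sketch (the function name `p_at_least_one` is ours):

```r
# Probability of at least one false positive among m independent tests,
# each performed at level alpha: 1 - (1 - alpha)^m
p_at_least_one <- function(m, alpha) 1 - (1 - alpha)^m

m <- 1:100
plot(m, p_at_least_one(m, alpha = 0.05), type = 'l',
     xlab = 'Number of tests m',
     ylab = 'P(at least one false positive)')
```

Note how quickly the curve rises: at \(\alpha = 0.05\), 100 tests already give more than a 99% chance of at least one false positive.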
Let \(F\) denote a cumulative distribution function (CDF): \(F_X(x) = P(X \leq x)\). To control the FDR at level \(\alpha\), we look for the threshold \(\theta\) such that \[\frac{\pi_0\theta}{F(\theta)}=\alpha,\] where \(\pi_0\) is the proportion of true null hypotheses and \(F\) is here the CDF of the p-values.
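As an illustration (not a prescribed solution), assuming we estimate \(F\) by the empirical CDF of simulated p-values and take the conservative choice \(\pi_0 = 1\), the threshold can be found by a simple grid search for the largest \(\theta\) satisfying the criterion:

```r
set.seed(1)
# Simulated p-values: 900 true nulls (uniform) and 100 signals near 0
pv <- c(runif(900), rbeta(100, 0.1, 10))

alpha <- 0.05
pi0   <- 1           # conservative guess for the proportion of true nulls
Fhat  <- ecdf(pv)    # empirical CDF of the p-values, our estimate of F

# Largest theta on a grid such that pi0 * theta / F(theta) <= alpha
grid  <- seq(1e-4, 1, by = 1e-4)
crit  <- pi0 * grid / pmax(Fhat(grid), 1 / length(pv))  # guard against F = 0
theta <- max(c(0, grid[crit <= alpha]))
```

P-values below `theta` are then called significant; the grid search mirrors the Benjamini-Hochberg idea of taking the largest threshold whose estimated FDR stays below \(\alpha\).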